ABSTRACT

Parallel studies at universities in Israel, Finland, 
Canada, and the United States used a "selected deletion gap-filling 
test," a variation on the cloze procedure designed to measure reading 
comprehension by testing the reader's familiarity with cohesive links 
and grasp of text coherence. The test design responded to the growing 
demand for a more efficient multiple-choice test of reading 
comprehension for large numbers of examinees. The study used 
discourse analysis to select the deletions and to examine their 
relationships to macro-structures in the text. Pilot testing with the 
varied university populations revealed the test to be statistically 
satisfactory. Recommendations are that the test be used as: (1) a
diagnostic test or as a test of language proficiency, with its 
deletions adapted to its function; and (2) part of a battery of tests 
measuring other aspects of language proficiency. Further research to 
expand the scope of the test is under way. (Author/MJE) 
ABSTRACT

To measure reading comprehension, teachers often ask students to 
fill in gaps in a text. This basic test format has many variations.
Traditionally, the test in which deletions occurred after every nth
word was termed 'cloze procedure.' Later, it was called a 'random
cloze' test, to be distinguished from the rational cloze, in which the
test designer decided which words to omit from the text. To avoid
confusion, we have termed our test SeDelGap, 'selected deletion
gap-filling test.'

There is controversy about exactly what is measured by a cloze
test. Some researchers (Oller, Bormuth, Jonz) have claimed that it is
a global measure of reading comprehension, while others (Alderson,
Porter, Klein-Braley) have argued that it merely shows a limited
knowledge of collocations on the micro-level. A discourse cloze,
deleting only cohesive markers (i.e., pronoun anaphora and
conjunctions), was described by Levenston, Nir, and Blum-Kulka (1984).
The Cohesion Cloze (Bensoussan, forthcoming) is based on the same
principle and claims that the blanks are independent of each other.

The SeDelGap principle measures reading comprehension on the
macro-level, testing the reader's familiarity with cohesive links and
grasp of text coherence. Discourse analysis is used to select 
deletions and to examine their relations to macro-structures in the 
text. 

To this end, parallel studies at different universities are being
carried out with macro-level gap-filling tests. In one series of
experiments, three texts, designed by Bensoussan and other researchers
(Elizabeth Tricomi at SUNY Binghamton and Randall Uehara at Harvard
Summer School in ESL), were administered to university students in
Israel (Haifa University), Finland (University of Helsinki), Canada
(OISE), and the USA (SUNY Binghamton and Harvard Summer School in ESL)
during the years 1981-1987. Item analysis showed the tests to be
statistically satisfactory.

Test researchers at the University of Helsinki have been
experimenting independently with the 'semantic cloze,' based on similar
principles. Simultaneous testing research in different universities
for speakers of many different native languages is exciting. It is
hoped that the SeDelGap principle can be used to generate more
research and better tests.
I. INTRODUCTION

The SeDelGap Test (Selected Deletion Gap-Filling Test) was designed
in response to the growing demand for a more efficient multiple-choice
test of reading comprehension for large numbers of examinees. Most
multiple-choice tests are inefficient in that, for the amount of
reading required, they yield relatively few questions. Moreover,
comprehension of the questions adds a further component to the
test. Because scores result from both text and questions, they do not
necessarily reflect readers' comprehension of the text alone.
Questions may also reflect the examiner's interpretation of the text,
thus biasing results. The cloze procedure, although claiming to test
reading comprehension, has been criticized for testing readers'
micro-level familiarity with a limited range of collocations and
idioms rather than examining macro-level comprehension of the writer's
ideas and opinions in the text. The SeDelGap Test aims to combine the
multiple-choice and cloze techniques to test reading comprehension on
the macro-level.

A summary of the development of cloze research would be helpful in
explaining the rationale behind the SeDelGap Test.
II. REVIEW OF THE LITERATURE

A. Cloze Tests as a Measure of Readability 

1. Definition of 'Cloze' 

The term 'cloze' first appeared in an article by Taylor (1953), who
proposed the procedure as a better measure of readability than
readability formulas. Taylor recommended random deletion to sample
the ability of a reader to comprehend a text. Bormuth (1966) reports
high correlations between readability formulas and cloze passages.

Oller and Conrad (1972) explain the reasoning behind the 'cloze'
procedure:

The term 'cloze' was used with the notion of Gestalt
"closure" in mind, referring to the natural human
psychological tendency to fill in gaps in patterns. The
restoration of words deleted from a selection of prose in
order for the passage to make sense is a special use of this
ability to complete broken patterns. (p. 183)

Carroll (1972) explains Taylor's procedure: 

The procedure involves taking a passage of text and deleting 
words in it by some rule, e.g., every 5th word, every other 
noun, or every other "function" word. A subject is then 
presented with the passage and asked to guess the missing
words. (p. 18)

Although linguistic criteria (parts of speech or function words)
may enter into this early cloze procedure, it is the randomness, and
not the structure of the text, that counts. From this automatic,
mechanical deletion process sprang a whole literature which applied
this procedure to a large variety of texts and students, making claims
for its performance, criticizing its effectiveness, and suggesting
modifications in scoring methods and deletion rates.
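
The mechanical character of this early procedure can be illustrated with a minimal Python sketch of fixed-ratio deletion, using the "every 5th word" rule quoted from Carroll. The function name, its parameters, and the sample sentence are illustrative assumptions and do not come from any of the studies cited.

```python
import string


def make_nth_word_cloze(text, n=5, first=5, blank="_____"):
    """Blank out every n-th word, starting at word number `first` (1-based).

    Returns the gapped text and the list of deleted words (the answer key).
    Punctuation attached to a deleted word is left in place.
    """
    gapped, answers = [], []
    for position, token in enumerate(text.split(), start=1):
        if position >= first and (position - first) % n == 0:
            core = token.strip(string.punctuation)
            if core:  # skip tokens that are pure punctuation
                answers.append(core)
                gapped.append(token.replace(core, blank, 1))
                continue
        gapped.append(token)
    return " ".join(gapped), answers


# Hypothetical sample sentence, for illustration only.
sample = "The cat sat on the mat, and the dog lay by the door."
gapped_text, key = make_nth_word_cloze(sample, n=5, first=5)
print(gapped_text)
print(key)
```

Note that the rule never consults the meaning or structure of the passage: which words are deleted depends only on their position.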

2. Reasons for Advocating the Cloze 

A number of researchers (Taylor 1953 and 1956, Gilliland 1970,
Hirsch 1977) see the cloze procedure as an accurate measure of
readability for two reasons: it includes the reader, and it makes use
of semantic and syntactic redundancy in the text (i.e., the context)
in the calculation of the readability score. That is, it corrects
some of the faults of the readability formulas.

Redundancy, as defined by Klare (1963), refers to "the extent to
which a given unit of language is determined by nearby units" (p.
172). Like readability, perception of redundancy varies not only with
the materials but also with the readers (Klare 1963, pp. 173-174).

Some researchers claimed that the cloze is a global measure of
language proficiency for native speakers of English (Weaver 1962 and
1965, Bormuth 1967 and 1968, Ramanauskas 1972, Oller 1975, and Ozete
1977). This claim was soon extended to include nonnative speakers of
English as well (Oller and Conrad 1972, Oller 1973, Irvine et al.
1974, Stubbs and Tucker 1974, Jonz 1976, Chihara et al. 1977, Berkoff
1979).

Finally, the SeDelGap test yields more items per minute than the
traditional multiple-choice test. Whereas a text of 400 words might
yield only ten multiple-choice questions in thirty minutes, it would
yield approximately thirty SeDelGap items in the same amount of time.
Thus the SeDelGap test would give the language teacher more
information than traditional multiple-choice items about students'
reading comprehension during the same amount of test time. It would
also increase the test's reliability, since statistical reliability
increases with the number of test items.
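
The arithmetic behind this claim can be sketched as follows. The item counts and timing are those given above; the Spearman-Brown prophecy formula is a standard way of projecting reliability for a lengthened test, but the paper does not say which formula the authors had in mind, and the starting reliability of 0.60 is a purely illustrative value, not a result from the study. The projection also assumes that the added items are comparable in quality to the existing ones.

```python
def items_per_minute(n_items, minutes):
    """Testing rate: number of scored items obtained per minute."""
    return n_items / minutes


def spearman_brown(reliability, length_factor):
    """Projected reliability when a test is lengthened by `length_factor`.

    Standard Spearman-Brown formula: k*r / (1 + (k - 1)*r).
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    if length_factor <= 0:
        raise ValueError("length_factor must be positive")
    return length_factor * reliability / (1 + (length_factor - 1) * reliability)


# Figures from the text: 10 multiple-choice vs. about 30 SeDelGap items
# in the same 30 minutes.
mc_rate = items_per_minute(10, 30)
sedelgap_rate = items_per_minute(30, 30)

# ASSUMPTION: 0.60 is a made-up reliability for the 10-item test.
projected = spearman_brown(0.60, length_factor=30 / 10)
print(round(mc_rate, 2), round(sedelgap_rate, 2), round(projected, 2))
```

Under these assumptions, tripling the number of items raises the projected reliability from 0.60 to roughly 0.82.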

3. Problems with the Random Cloze Procedure 

Not all researchers are enthusiastic about the cloze as a global
measure of reading proficiency, however. Kintsch and Vipond (1977) do
not believe redundancy and readability to be closely related:

The cloze procedure, on the other hand, is probably actually
misleading. It measures the statistical redundancy of a
text, which is a far cry from its comprehensibility. By that
score, a high-order statistical approximation to English
that nevertheless constitutes incomprehensible gibberish
would be preferred to a well-organized text with less
predictable local patterns. (p. 337)

Other researchers are also skeptical of the random cloze procedure
(Carroll 1972; Porter 1975; Alderson 1969, 1979, and 1980; Baten 1981;
and Klein-Braley 1981).

Opponents of the random cloze present a list of drawbacks. They
state that it does not measure what its promoters say it does.
Because language production is necessary, it is not only a measure of
reading ability (Porter 1975). Changes in deletion rates can alter
the test unpredictably, so that it cannot be universally applied to
every text (Alderson 1969, 1979, and 1980; Klein-Braley 1981). It is
not a test of global comprehension across sentence boundaries but a
discrete item test that is sentence (or even clause) bound (Alderson
1969, Carroll 1972, Klein-Braley 1981). Random cloze tests do not
always distinguish between natives and nonnatives (Alderson 1980),
since even natives have difficulty filling in the cloze and are
not necessarily able to get a perfect score (as would normally be
expected on a test for foreign language learners).

4. Modified Rational Cloze

Having rejected the random cloze as not being an automatically
valid testing procedure, a number of researchers suggested rational
deletion methods according to linguistic principles (Weaver 1962,
Greene 1965, Alderson 1969, Cranney 1972-73, Klein-Braley 1981,
Bensoussan and Ramraz 1984).

The rational cloze procedure can be used as a measure of text
difficulty for a certain population. Depending on the placement of
blanks, different kinds of tests can be obtained. Blanks can be put
in place of content words, function words, parts of speech, or markers
of cohesion; these words can be used to test comprehension on the
micro-level or the macro-level. Even the rational cloze does not
necessarily test the student's grasp of the content or ideas in the
text, however.

Greene (1965) explains the rationale behind a modified cloze test 
which he constructed:

each possible deletion was evaluated by the author for
possible effectiveness and deletions made on this rational
rather than mechanical basis. For each word deleted under
the modified cloze procedure, there was felt to be
sufficient redundancy remaining in the passage so that a
superior reader could make positive identification of the
missing word. (pp. 213-214)

Other researchers advocate deleting certain parts of speech (Weaver
1962, Klein-Braley 1981) or a certain percentage of content vs. 
function words (Berkoff 1979). 

Working with nonnatives, Bachman (1982) deleted on the basis of
syntactic (clause-level context), cohesive (inter-clause or
inter-sentential context), or strategic (parallel) patterns of
coherence (p. 63). Also working with EFL students, Berkoff (1979) and
Sim (1979) experimented with rational cloze to test comprehension of
items of coherence and cohesion.

Using the rational cloze to measure reading comprehension, the
researcher could use Greene's (1965) criteria in determining
deletions. The resulting cloze tests should contain sufficient
redundancy to make sense to the competent native reader.

A multiple-choice modification of the rational cloze procedure was
designed by Bensoussan and Ramraz (1984). The basic advantage of this
method over the multiple-choice test is that the correct answer does
not reflect the tester's interpretation of the text, but is an
integral part of the text. The basic advantage over standard cloze
procedure is that the focus is on recognition, not production. That
is, the focus is on reading uncontaminated by the element of writing.

5. Cloze Procedures on the Discourse Level 

Recently there have been two efforts at tapping comprehension on
the discourse level by means of a modified cloze procedure. The
Discourse Cloze (Levenston, Nir, and Blum-Kulka 1984) deleted only
overt cohesion markers of co-reference and connectives between
propositions. Its authors assumed that the correct completion of
macro-level items deleted from a text indicates understanding of the
whole discourse.

The second effort is the Semantic Cloze (Mauranen 1988), designed to
test comprehension of advanced, academic texts. It aimed to delete more
macro-level than micro-level words, deleting content words (nouns,
verbs, adjectives, and adverbs) and discourse markers, and it included
alternative responses for each deletion. It has the advantage over a
standard cloze of actually avoiding the micro-level bias. There is
also evidence (Mauranen 1988) to suggest that the test is sensitive to
changes in students' reading comprehension skills.

III. SeDelGap TEST PRINCIPLES

Expository texts used in the SeDelGap procedure should be
culturally and thematically neutral but still interesting. They must
be culturally neutral so as not to disadvantage any particular group
of students. Thematic neutrality ensures objectivity; when students
read a text that clashes with their own personal, religious, or
political opinions, they become too impatient or irritated to fully
comprehend the writer's point of view. Instead, they tend to impose
their own opinions or schemata on the text. If the text is dull, on
the other hand, students become bored. All these deviations interfere
with an objective assessment of students' reading comprehension and
generally result in artificially low test scores.

The texts should also present new information to the students, even
if it is in a familiar content area. This is not only to avoid
boredom, but to tap the students' ability to cope with new meanings
expressed in a foreign code.

B. Selection of Deletions

A blank may take the place of one or more words so that the meaning 
is recoverable from the text. Textual clues derive from redundancy, 
collocations, denotations and connotations (negative and positive), 
and opposites in the text. Each blank in the text should be
conceptually and linguistically independent of the others. 

To restore a deletion, students should have to read beyond the
clause in which the deletion appears. The SeDelGap procedure is based
on deletions which can be completed by knowledge of the ideas and
structure of the text on the macro-level. It tests familiarity with
the coherence and cohesion of the text. Logical relations of
coherence would include general/specific (e.g., example),
cause/effect, contrast/comparison, addition, series, parallel ideas,
and analogy/metaphor/simile. The writer's attitude or intention is
another macro-level construct that should be tested. Cohesive markers
include pronouns (e.g., it, hers, this), substitution (one, do, so,
not), sentence connectors (conjunctions), and lexical cohesion:
repetition, near-synonyms, superordinates and subordinates. Deletions
which can be completed by micro-level knowledge of the words
immediately surrounding the blank are more likely to be testing
grammar than comprehension of ideas, and are best avoided.

When the SeDelGap procedure also includes the element of
multiple-choice distractors, each deletion must be able to yield
alternate answers. The blank may be situated in such a way as to test
an idea, pattern, or structure directly (i.e., the meaning of a
particular word) or indirectly (i.e., a word tapping a logical
relation such as contrast, exemplification, cause/effect).
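
As a rough aid to this selection process, a test designer might first list the cohesive markers that occur in a text and then decide by hand which of them depend on macro-level context. The Python sketch below only flags candidates. The pronoun and substitution entries are the examples given above; the connector list and the function name are illustrative assumptions, and the final choice of deletions remains a matter of the designer's judgement.

```python
import re

# Pronoun and substitution entries are the examples named in the text.
# ASSUMPTION: the connector list is a small illustrative sample.
COHESIVE_MARKERS = {
    "pronoun": {"it", "hers", "this"},
    "substitution": {"one", "do", "so", "not"},
    "connector": {"however", "therefore", "because", "but", "although"},
}


def candidate_deletions(text):
    """Return (word_index, word, category) for each cohesive marker found.

    A word listed in more than one category is reported under the first
    matching category. The output is a list of candidates only; it does
    not decide whether a blank is recoverable from macro-level context.
    """
    candidates = []
    for index, match in enumerate(re.finditer(r"[A-Za-z']+", text)):
        word = match.group()
        for category, markers in COHESIVE_MARKERS.items():
            if word.lower() in markers:
                candidates.append((index, word, category))
                break
    return candidates


# Hypothetical sample text, for illustration only.
for candidate in candidate_deletions(
    "The test was long. However, students finished it."
):
    print(candidate)
```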

C. Selection of Multiple-Choice Alternate Responses 

Each deletion should test only one point of comprehension (e.g.,
idea, verb tense, sequencing, contrast, content/function). It is
desirable for all responses to be parallel in form (i.e., all
adjectives, gerunds, conjunctions, etc.) and register. Opposites can
be good distractors.

All alternate responses must be grammatically correct since the 
SeDelGap does not test grammar but reading comprehension. To find the 
correct response, the reader needs to make use of the context 
surrounding the blank. 

Since this is a foreign language test, it is best to avoid unusual
words and fine semantic and syntactic distinctions which may confuse
even the native speaker (e.g., its vs. it's).
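
The structure of a multiple-choice SeDelGap item, as described in this section, can be represented in a small Python sketch. The class, its field names, and the two sample items are hypothetical and are not taken from the actual tests. Only the mechanical constraints are checked here (the keyed answer must be among the alternates, and the alternates must be distinct); grammaticality and parallel form still have to be judged by the test designer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GapItem:
    """One multiple-choice gap: the original word plus its distractors."""
    number: int
    stem: str        # sentence containing the blank
    options: tuple   # alternate responses, including the original word
    answer: str      # the word actually deleted from the text

    def __post_init__(self):
        if self.answer not in self.options:
            raise ValueError("the keyed answer must be one of the options")
        if len(set(self.options)) != len(self.options):
            raise ValueError("options must be distinct")


def score(items, responses):
    """Count correct responses; `responses` maps item number to choice."""
    return sum(1 for item in items if responses.get(item.number) == item.answer)


# Hypothetical items, for illustration only. Each set of alternates is
# parallel in form (all connectors, all pronouns).
items = [
    GapItem(1, "The random cloze is easy to build. ____, critics doubt "
               "that it measures comprehension.",
            ("However", "Therefore", "Similarly", "Likewise"), "However"),
    GapItem(2, "The designer chooses the blanks. ____ can be aimed at "
               "cohesive links.",
            ("They", "We", "You", "I"), "They"),
]
print(score(items, {1: "However", 2: "We"}))
```

Because the keyed answer is always the word deleted from the text, the key does not depend on the tester's interpretation of the passage.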

IV. COMPARISON OF SeDelGap TESTS

Recently, parallel studies at different universities have been
carried out with macro-level gap-filling tests. In one series of
experiments, three texts, designed by Bensoussan and other researchers
(Elizabeth Tricomi at SUNY Binghamton and Randall Uehara at Harvard
Summer School in ESL), were administered to university students in
Israel (Haifa University), Finland (University of Helsinki), Canada
(OISE), and the USA (SUNY Binghamton and Harvard Summer School in ESL)
during the years 1981-1987. Item analysis showed the tests to be
statistically satisfactory (see Table).
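
The paper does not specify which item statistics were computed. As a minimal sketch of what an item analysis commonly involves, the Python code below computes item facility (the proportion of examinees answering correctly) and a simple upper-lower discrimination index. The function names are assumptions, and the response matrix is invented toy data, not data from the studies described here.

```python
def item_facility(matrix, item):
    """Proportion of examinees who answered `item` correctly (0 to 1)."""
    return sum(row[item] for row in matrix) / len(matrix)


def discrimination_index(matrix, item, fraction=0.27):
    """Upper-group minus lower-group proportion correct on `item`.

    Examinees are ranked by total score; the top and bottom `fraction`
    of them form the two groups. Values near zero or below suggest the
    item does not separate stronger from weaker readers.
    """
    ranked = sorted(matrix, key=sum, reverse=True)
    size = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:size], ranked[-size:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / size


# INVENTED toy data: rows are examinees, columns are items, 1 = correct.
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
for i in range(3):
    print(i + 1,
          round(item_facility(responses, i), 2),
          round(discrimination_index(responses, i, fraction=0.5), 2))
```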

TABLE: SeDelGap Test Collaboration

V. FURTHER IMPLICATIONS FOR TEACHING AND TESTING LANGUAGE

The SeDelGap may be used as a diagnostic test or as a test of
language proficiency. The test designer can select blanks and
alternate responses according to the function of the test.

What the SeDelGap, in this form, does not test is readers'
independent and critical interpretation of the main points of a text.
To remedy this situation, we have been experimenting with a
combination of question types, with ordinary general comprehension
questions at the beginning and/or end of the undeleted part of the
text, and deletions in most of the middle part of the text. This
mixed format has worked well and has not been confusing to students.

The SeDelGap would be useful in a battery of tests, also including
testing formats such as written summary and oral interview, each
focusing on a different aspect of language proficiency.

Simultaneous testing research in different universities for 
speakers of many different native languages is exciting. It is hoped 
that the SeDelGap principle can be used to generate more research and
better tests. 